Theory reconstruction: a representation learning view on predicate invention
With this position paper we present a representation learning view on
predicate invention. The intention of this proposal is to bridge the relational
and deep learning communities on the problem of predicate invention. We propose
a theory reconstruction approach, a formalism that extends the autoencoder
approach to representation learning to the relational setting. Our aim is to
start a discussion to define a unifying framework for predicate invention and
theory revision.
Comment: 3 pages, StaRAI 2016 submission
SAIFE: Unsupervised Wireless Spectrum Anomaly Detection with Interpretable Features
Detecting anomalous behavior in wireless spectrum is a demanding task due to
the sheer complexity of the electromagnetic spectrum use. Wireless spectrum
anomalies can take a wide range of forms from the presence of an unwanted
signal in a licensed band to the absence of an expected signal, which makes
manual labeling of anomalies difficult and suboptimal. We present the Spectrum
Anomaly Detector with Interpretable FEatures (SAIFE), an Adversarial
Autoencoder (AAE) based anomaly detector for wireless spectrum anomaly
detection using Power Spectral Density (PSD) data, which achieves good anomaly
detection and localization in an unsupervised setting. In addition, we
investigate the model's capability to learn interpretable features such as
signal bandwidth, class and center frequency in a semi-supervised fashion.
Along with anomaly detection, the model exhibits promising results for lossy
PSD data compression up to 120X and semi-supervised signal classification
accuracy close to 100% on three datasets using just 20% labeled samples.
Finally, the model is tested on data from one of the distributed Electrosense
sensors over a long period of 500 hours, showing its anomaly detection
capabilities.
Comment: Copyright IEEE, Accepted for DySPAN 201
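The core mechanism the abstract describes, flagging spectra that a learned low-dimensional model reconstructs poorly, can be illustrated with a deliberately simplified stand-in: a linear autoencoder (PCA) in place of SAIFE's adversarial autoencoder. The data, dimensions, and threshold below are invented for illustration; this is a sketch of the reconstruction-error idea, not the paper's model.

```python
import numpy as np

def fit_linear_autoencoder(psd, k):
    """Fit a k-dimensional linear autoencoder (PCA) to PSD rows."""
    mean = psd.mean(axis=0)
    # principal directions of the centered data
    _, _, vt = np.linalg.svd(psd - mean, full_matrices=False)
    return mean, vt[:k]

def anomaly_score(x, mean, comps):
    """Reconstruction error: high values flag anomalous spectra."""
    z = (x - mean) @ comps.T          # encode
    recon = z @ comps + mean          # decode
    return float(np.sum((x - recon) ** 2))

# Toy example: "normal" PSD vectors lie near a 1-D subspace;
# an unexpected narrowband spike raises the reconstruction error.
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, np.pi, 64))
normal = base * rng.uniform(0.8, 1.2, size=(100, 1))
mean, comps = fit_linear_autoencoder(normal, k=1)

nominal = anomaly_score(base, mean, comps)
anomalous = anomaly_score(base + np.eye(64)[10] * 5.0, mean, comps)
print(anomalous > 10 * max(nominal, 1e-9))  # prints True
```

Because the anomalous spike is largely orthogonal to the learned subspace, its energy survives reconstruction almost untouched, which is what makes reconstruction error a usable unsupervised anomaly score.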
Learning Relational Representations with Auto-encoding Logic Programs
Deep learning methods capable of handling relational data have proliferated
in recent years. In contrast to traditional relational learning methods
that leverage first-order logic for representing such data, these deep learning
methods aim at re-representing symbolic relational data in Euclidean spaces.
They offer better scalability, but can only numerically approximate relational
structures and are less flexible in terms of reasoning tasks supported. This
paper introduces a novel framework for relational representation learning that
combines the best of both worlds. This framework, inspired by the auto-encoding
principle, uses first-order logic as a data representation language, and the
mapping between the original and latent representation is done by means of
logic programs instead of neural networks. We show how learning can be cast as
a constraint optimisation problem for which existing solvers can be used. The
use of logic as a representation language makes the proposed framework more
accurate (as the representation is exact, rather than approximate), more
flexible, and more interpretable than deep learning methods. We experimentally
show that these latent representations are indeed beneficial in relational
learning tasks.
Comment: 8 pages, 4 figures, paper + supplement, published at IJCAI
COBRAS-TS: A new approach to Semi-Supervised Clustering of Time Series
Clustering is ubiquitous in data analysis, including analysis of time series.
It is inherently subjective: different users may prefer different clusterings
for a particular dataset. Semi-supervised clustering addresses this by allowing
the user to provide examples of instances that should (not) be in the same
cluster. This paper studies semi-supervised clustering in the context of time
series. We show that COBRAS, a state-of-the-art semi-supervised clustering
method, can be adapted to this setting. We refer to this approach as COBRAS-TS.
An extensive experimental evaluation supports the following claims: (1)
COBRAS-TS far outperforms the current state of the art in semi-supervised
clustering for time series, and thus presents a new baseline for the field; (2)
COBRAS-TS can identify clusters with separated components; (3) COBRAS-TS can
identify clusters that are characterized by small local patterns; (4) a small
amount of semi-supervision can greatly improve clustering quality for time
series; (5) the choice of the clustering algorithm matters (contrary to earlier
claims in the literature).
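COBRAS itself operates on super-instances with a more refined query strategy; the toy sketch below only illustrates the general semi-supervised loop the abstract builds on: repeatedly propose the closest pair of clusters, ask the user whether their representatives belong together, and either merge or record a cannot-link. The oracle, data, and query budget are invented.

```python
import numpy as np

def constrained_cluster(points, oracle, max_queries=10):
    """Greedy semi-supervised clustering sketch (in the spirit of,
    not identical to, COBRAS): merge on 'must-link', forbid on
    'cannot-link', one oracle query per iteration."""
    clusters = [[i] for i in range(len(points))]
    cannot = set()  # point pairs known to lie in different clusters

    def forbidden(ca, cb):
        return any(frozenset((p, q)) in cannot for p in ca for q in cb)

    for _ in range(max_queries):
        pairs = [(np.linalg.norm(points[ca].mean(axis=0)
                                 - points[cb].mean(axis=0)), i, j)
                 for i, ca in enumerate(clusters)
                 for j, cb in enumerate(clusters)
                 if i < j and not forbidden(ca, cb)]
        if not pairs:
            break
        _, i, j = min(pairs)
        if oracle(clusters[i][0], clusters[j][0]):   # must-link?
            clusters[i] += clusters.pop(j)
        else:
            cannot.add(frozenset((clusters[i][0], clusters[j][0])))
    return clusters

# Toy 1-D "time series" each summarized by one feature; the oracle
# answers queries from hidden ground-truth labels.
points = np.array([[0.0], [0.1], [0.2], [5.0], [5.1]])
truth = [0, 0, 0, 1, 1]
result = constrained_cluster(points, lambda p, q: truth[p] == truth[q])
print(sorted(sorted(c) for c in result))  # [[0, 1, 2], [3, 4]]
```

Even this crude loop shows claim (4) in miniature: four queries suffice to recover the intended grouping, where an unsupervised method has no way to know which partition the user prefers.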
Distributed Deep Learning Models for Wireless Signal Classification with Low-Cost Spectrum Sensors
This paper looks into the technology classification problem for a distributed
wireless spectrum sensing network. First, a new data-driven model for Automatic
Modulation Classification (AMC) based on long short-term memory (LSTM) is
proposed. The model learns from the time-domain amplitude and phase information
of the modulation schemes present in the training data without requiring expert
features like higher-order cyclic moments. Analyses show that the proposed
model yields an average classification accuracy close to 90% under varying SNR
conditions ranging from 0 dB to 20 dB. Further, we explore the utility of this
LSTM model for a variable symbol rate scenario. We show that an LSTM-based
model can learn good representations of variable-length time-domain sequences,
which is useful in classifying modulation signals with different symbol rates.
The achieved accuracy of 75% on an input sample length of 64, for which it was
not trained, substantiates the representation power of the model. To reduce
the data communication overhead from distributed sensors, we study the
feasibility of classification using averaged magnitude spectrum data, or of
online classification on the low-cost sensors. Furthermore, quantized
realizations of the proposed models are analyzed for deployment on sensors
with low processing power.
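The input representation the abstract describes, raw time-domain amplitude and phase per sample rather than expert features, can be sketched as below. The LSTM itself is omitted, and the "QPSK-like" burst is an invented toy signal; this only shows how complex IQ samples become the (amplitude, phase) sequence fed to such a model.

```python
import numpy as np

def amplitude_phase_features(iq):
    """Convert complex IQ samples into the per-timestep
    (amplitude, phase) sequence used as LSTM input."""
    amplitude = np.abs(iq)
    phase = np.angle(iq)  # radians in (-pi, pi]
    return np.stack([amplitude, phase], axis=-1)  # shape (T, 2)

# Toy QPSK-like burst: unit-amplitude symbols at four phase offsets.
symbols = np.exp(1j * np.pi / 4 * np.array([1, 3, 5, 7]))
feats = amplitude_phase_features(symbols)
print(feats.shape)                     # (4, 2)
print(bool(np.allclose(feats[:, 0], 1.0)))  # True: unit amplitude
```

A sequence model consumes this (T, 2) array one timestep at a time, which is also why variable-length sequences (the variable symbol rate scenario) need no change to the feature extraction.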
Additive Tree Ensembles: Reasoning About Potential Instances
Imagine being able to ask questions to a black box model such as "Which
adversarial examples exist?", "Does a specific attribute have a
disproportionate effect on the model's prediction?" or "What kind of
predictions are possible for a partially described example?" This last question
is particularly important if your partial description does not correspond to
any observed example in your data, as it provides insight into how the model
will extrapolate to unseen data. These capabilities would be extremely helpful
as it would allow a user to better understand the model's behavior,
particularly as it relates to issues such as robustness, fairness, and bias. In
this paper, we propose such an approach for an ensemble of trees. Since, in
general, this task is intractable, we present a strategy that (1) can prune
part of the input space given the question asked, to simplify the problem; and
(2) follows a divide-and-conquer approach that is incremental, can always
return some answers, and indicates which parts of the input domains are still
uncertain. The usefulness of our approach is shown on a diverse set of use
cases.
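A minimal sketch of the "partially described example" question: when some attributes are fixed, unreachable branches can be pruned per tree, and the surviving leaf combinations over-approximate the set of possible ensemble predictions. The tree encoding, thresholds, and values below are invented, and the paper's actual pruning and divide-and-conquer strategy is more sophisticated than this brute-force enumeration.

```python
from itertools import product

def reachable_leaves(tree, known):
    """Leaves of one tree reachable given a partial example.
    tree:  leaf value (float) or (feature, threshold, left, right),
           with the split taken as x[feature] < threshold;
    known: dict feature -> value for the described attributes."""
    if not isinstance(tree, tuple):
        return {tree}
    f, t, left, right = tree
    if f in known:  # attribute fixed: only one branch survives
        return reachable_leaves(left if known[f] < t else right, known)
    return reachable_leaves(left, known) | reachable_leaves(right, known)

def possible_predictions(trees, known):
    """All additive-ensemble outputs consistent with the partial
    description. This is a relaxation: consistency of the free
    attributes across trees is not enforced, so the result
    over-approximates the true set."""
    per_tree = [reachable_leaves(t, known) for t in trees]
    return {sum(combo) for combo in product(*per_tree)}

# Two stumps; fixing x0 = 3.0 prunes one branch of the first tree.
trees = [(0, 2.0, 1.0, -1.0), (1, 0.5, 0.0, 2.0)]
print(sorted(possible_predictions(trees, {0: 3.0})))  # [-1.0, 1.0]
```

Asking the same question with nothing fixed returns the full output set, so comparing the two answers shows exactly how much the partial description constrains the model.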
Versatile Verification of Tree Ensembles
Machine learned models often must abide by certain requirements (e.g.,
fairness or legal constraints). This has spurred interest in developing
approaches that
can provably verify whether a model satisfies certain properties. This paper
introduces a generic algorithm called Veritas that enables tackling multiple
different verification tasks for tree ensemble models like random forests (RFs)
and gradient boosting decision trees (GBDTs). This generality contrasts with
previous work, which has focused exclusively on either adversarial example
generation or robustness checking. Veritas formulates the verification task as
a generic optimization problem and introduces a novel search space
representation. Veritas offers two key advantages. First, it provides anytime
lower and upper bounds when the optimization problem cannot be solved exactly.
In contrast, many existing methods have focused on exact solutions and are thus
limited by the verification problem being NP-complete. Second, Veritas produces
full (bounded suboptimal) solutions that can be used to generate concrete
examples. We experimentally show that Veritas outperforms the previous state of
the art by (a) generating exact solutions more frequently, (b) producing
tighter bounds when (a) is not possible, and (c) offering orders of magnitude
speed-ups. Consequently, Veritas enables tackling more and larger real-world
verification scenarios.
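The anytime-bound idea can be illustrated with simple interval propagation over an additive tree ensemble: bounding each tree independently over a box of inputs is sound but relaxes cross-tree consistency, giving cheap lower/upper bounds of the kind that further search can then tighten. Veritas's actual search-space representation differs; the trees, thresholds, and box below are invented.

```python
def tree_bounds(tree, box):
    """Min/max of one decision tree over an axis-aligned box.
    tree: leaf value (float) or (feature, threshold, left, right),
          with the split taken as x[feature] < threshold;
    box:  list of (lo, hi) intervals, one per feature."""
    if not isinstance(tree, tuple):
        return tree, tree
    f, t, left, right = tree
    lo, hi = box[f]
    results = []
    if lo < t:   # left branch reachable within the box
        sub = box.copy(); sub[f] = (lo, min(hi, t))
        results.append(tree_bounds(left, sub))
    if hi >= t:  # right branch reachable within the box
        sub = box.copy(); sub[f] = (max(lo, t), hi)
        results.append(tree_bounds(right, sub))
    return min(r[0] for r in results), max(r[1] for r in results)

def ensemble_bounds(trees, box):
    """Sound (possibly loose) output bounds for an additive ensemble:
    per-tree bounds are summed, ignoring that all trees must read the
    same input, which is exactly the relaxation that makes it cheap."""
    los, his = zip(*(tree_bounds(t, box) for t in trees))
    return sum(los), sum(his)

# Two stumps on feature 0: certify the output stays within [-0.5, 3.0]
# whenever x0 is confined to [0, 5].
trees = [(0, 2.0, 1.0, -1.0), (0, 4.0, 2.0, 0.5)]
lo, hi = ensemble_bounds(trees, [(0.0, 5.0)])
print(lo, hi)  # -0.5 3.0
```

Splitting the box and recursing shrinks the gap between such bounds, which is the anytime behavior: at any point the search holds valid lower and upper bounds even if the NP-complete exact problem is not solved.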
Crowdsourced wireless spectrum anomaly detection
Automated wireless spectrum monitoring across frequency, time and space will
be essential for many future applications. Manual and fine-grained spectrum
analysis is becoming impossible because of the large number of measurement
locations and complexity of the spectrum use landscape. Detecting unexpected
behaviors in the wireless spectrum from the collected data is a crucial part of
this automated monitoring, and the control of detected anomalies is a key
functionality to enable interaction between the automated system and the end
user. In this paper we look into the wireless spectrum anomaly detection
problem for crowdsourced sensors. We first analyze in detail the nature of
these anomalies and design effective algorithms to bring the higher dimensional
input data to a common feature space across sensors. Anomalies can then be
detected as outliers in this feature space. In addition, we investigate the
importance of user feedback in the anomaly detection process to improve the
performance of unsupervised anomaly detection. Furthermore, schemes for
generalizing user feedback across sensors are also developed to close the
anomaly detection loop.
Comment: IEEE, under review
An Automated Engineering Assistant: Learning Parsers for Technical Drawings
From a set of technical drawings and expert knowledge, we automatically learn
a parser to interpret such a drawing. This enables automatic reasoning and
learning on top of a large database of technical drawings. In this work, we
develop a similarity based search algorithm to help engineers and designers
find or complete designs more easily and flexibly. This is part of an ongoing
effort to build an automated engineering assistant. The proposed methods make
use of both neural methods to learn to interpret images, and symbolic methods
to learn to interpret the structure in the technical drawing and incorporate
expert knowledge.
Acceleration of probabilistic reasoning through custom processor architecture
Probabilistic reasoning is an essential tool for robust decision-making
systems because of its ability to explicitly handle real-world uncertainty,
constraints and causal relations. Consequently, researchers are developing
hybrid models by combining Deep Learning with probabilistic reasoning for
safety-critical applications like self-driving vehicles, autonomous drones,
etc. However, probabilistic reasoning kernels do not execute efficiently on
CPUs or GPUs. This paper therefore proposes a custom programmable processor
to accelerate sum-product networks, an important probabilistic reasoning
execution kernel. The processor has a datapath architecture and memory
hierarchy optimized for sum-product network execution. Experimental results
show that the processor, while requiring fewer computational and memory
units, achieves a 12x throughput benefit over the Nvidia Jetson TX2 embedded
GPU platform.
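For readers unfamiliar with the kernel being accelerated: a sum-product network is evaluated bottom-up, with product nodes multiplying their children (factorization) and sum nodes taking weighted mixtures of theirs. The pure-Python evaluator below, with an invented two-variable toy network and binary indicator leaves, only shows the computation pattern the custom processor accelerates, not the paper's architecture.

```python
def eval_spn(node, x):
    """Bottom-up evaluation of a sum-product network on assignment x."""
    kind = node["type"]
    if kind == "leaf":
        # indicator leaf: probability of the observed value of one variable
        return node["dist"][x[node["var"]]]
    children = [eval_spn(c, x) for c in node["children"]]
    if kind == "product":
        out = 1.0
        for v in children:
            out *= v
        return out
    # sum node: convex combination of child values
    return sum(w * v for w, v in zip(node["weights"], children))

# P(A, B) over two binary variables as a 2-component mixture.
leaf = lambda var, p1: {"type": "leaf", "var": var, "dist": [1 - p1, p1]}
spn = {"type": "sum", "weights": [0.6, 0.4], "children": [
    {"type": "product", "children": [leaf(0, 0.9), leaf(1, 0.9)]},
    {"type": "product", "children": [leaf(0, 0.1), leaf(1, 0.1)]},
]}
print(round(eval_spn(spn, [1, 1]), 3))  # 0.49
```

The workload is a long chain of multiply and weighted-add operations over a fixed DAG with irregular memory access, which is why a datapath and memory hierarchy tuned to that pattern can beat a general-purpose GPU.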